Lightweight Parsing of Classifications into Lightweight Ontologies
Abstract
Understanding metadata written in natural language is a prerequisite for the successful automated integration of large-scale, language-rich classifications such as the ones used in digital libraries. We analyze the natural language labels within classifications by exploring their syntactic structure. We then show how this structure can be used to detect language patterns that a lightweight parser can process with an average accuracy of 96.82%. This allows for a deeper understanding of natural language metadata semantics, which we show can improve, by almost 18%, the accuracy of the automatic translation of classifications into the lightweight ontologies required by semantic matching, search and classification algorithms.
Similar resources
Lightweight Parsing of Classifications
Understanding metadata written in natural language is a crucial requirement towards the successful automated integration of large scale, language-rich, classifications such as the ones used in digital libraries. In this article we analyze natural language labels used in such classifications by exploring their syntactic structure, and then we show how this structure can be used to detect pattern...
Faceted Lightweight Ontologies: a Formalization and some Experiments
While classifications are heavily used to categorize web content, the evolution of the web foresees a more formal structure – ontology which can serve this purpose. Ontologies are core artifacts of the Semantic Web which enable machines to use inference rules to conduct automated reasoning on data. Lightweight ontologies bridge the gap between classifications and ontologies. A lightweight ontol...
Encoding Classifications into Lightweight Ontologies
Classifications have been used for centuries with the goal of cataloguing and searching large sets of objects. In the early days it was mainly books; lately it has also become Web pages, pictures and any kind of digital resources. Classifications describe their contents using natural language labels, an approach which has proved very effective in manual classification. However natural language ...
Encoding Classifications into Lightweight
Classifications have been used for centuries with the goal of cataloguing and searching large sets of objects. In the early days it was mainly books; lately it has also become Web pages, pictures and any kind of electronic information items. Classifications describe their contents using natural language labels, which has proved very effective in manual classification. However natural language l...
From Web Directories to Ontologies: Natural Language Processing Challenges
Hierarchical classifications are used pervasively by humans as a means to organize their data and knowledge about the world. One of their main advantages is that natural language labels, used to describe their contents, are easily understood by human users. However, at the same time, this is also one of their main disadvantages as these same labels are ambiguous and very hard to be reasoned abo...